13:34
2026-06-27
lesswrong.com
ai-safety
Flipping the eval on its head
A new approach to cybersecurity evaluations proposes using language models as constant red-team oracles to empirically compare the attack surfaces of different software implementations, such as OpenSS…